Training Recurrent Neural Networks by Diffusion

نویسنده

  • Hossein Mobahi
چکیده

This work presents a new algorithm for training recurrent neural networks (although ideas are applicable to feedforward networks as well). The algorithm is derived from a theory in nonconvex optimization related to the diffusion equation. The contributions made in this work are two fold. First, we show how some seemingly disconnected mechanisms used in deep learning such as smart initialization, annealed learning rate, layerwise pretraining, and noise injection (as done in dropout and SGD) arise naturally and automatically from this framework, without manually crafting them into the algorithms. Second, we present some preliminary results on comparing the proposed method against SGD. It turns out that the new algorithm can achieve similar level of generalization accuracy of SGD in much fewer number of epochs.

برای دانلود رایگان متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Robust stability of stochastic fuzzy impulsive recurrent neural networks with\ time-varying delays

In this paper, global robust stability of stochastic impulsive recurrent neural networks with time-varyingdelays which are represented by the Takagi-Sugeno (T-S) fuzzy models is considered. A novel Linear Matrix Inequality (LMI)-based stability criterion is obtained by using Lyapunov functional theory to guarantee the asymptotic stability of uncertain fuzzy stochastic impulsive recurrent neural...

متن کامل

A Monte Carlo EM Approach for Partially Observable Diffusion Processes: Theory and Applications to Neural Networks

We present a Monte Carlo approach for training partially observable diffusion processes. We apply the approach to diffusion networks, a stochastic version of continuous recurrent neural networks. The approach is aimed at learning probability distributions of continuous paths, not just expected values. Interestingly, the relevant activation statistics used by the learning rule presented here are...

متن کامل

Multi-Step-Ahead Prediction of Stock Price Using a New Architecture of Neural Networks

Modelling and forecasting Stock market is a challenging task for economists and engineers since it has a dynamic structure and nonlinear characteristic. This nonlinearity affects the efficiency of the price characteristics. Using an Artificial Neural Network (ANN) is a proper way to model this nonlinearity and it has been used successfully in one-step-ahead and multi-step-ahead prediction of di...

متن کامل

Solving Linear Semi-Infinite Programming Problems Using Recurrent Neural Networks

‎Linear semi-infinite programming problem is an important class of optimization problems which deals with infinite constraints‎. ‎In this paper‎, ‎to solve this problem‎, ‎we combine a discretization method and a neural network method‎. ‎By a simple discretization of the infinite constraints,we convert the linear semi-infinite programming problem into linear programming problem‎. ‎Then‎, ‎we use...

متن کامل

Fault Detection and Location in DC Microgrids by Recurrent Neural Networks and Decision Tree Classifier

Microgrids have played an important role in distribution networks during recent years.  DC microgrids are very popular among researchers because of their benefits. Protection is one of the significant challenges in the way of microgrids progress. As a result, in this paper, a fault detection and location scheme for DC microgrids is proposed. Due to advances in Artificial Intelligence (AI) and s...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

عنوان ژورنال:
  • CoRR

دوره abs/1601.04114  شماره 

صفحات  -

تاریخ انتشار 2016